Linear Programming for Large-Scale Markov Decision Problems

Authors

  • Alan Malek
  • Yasin Abbasi-Yadkori
  • Peter L. Bartlett
Abstract

We consider the problem of controlling a Markov decision process (MDP) with a large state space, so as to minimize average cost. Since it is intractable to compete with the optimal policy for large-scale problems, we pursue the more modest goal of competing with a low-dimensional family of policies. We use the dual linear programming formulation of the MDP average cost problem, in which the variable is a stationary distribution over state-action pairs, and we consider a neighborhood of a low-dimensional subset of the set of stationary distributions (defined in terms of state-action features) as the comparison class. We propose a technique based on stochastic convex optimization and give bounds that show that the performance of our algorithm approaches the best achievable by any policy in the comparison class. Most importantly, this result depends on the size of the comparison class, but not on the size of the state space. Preliminary experiments show the effectiveness of the proposed algorithm in a queuing application.
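
For reference, the dual linear program mentioned in the abstract is usually written as below. This is a sketch in common textbook notation and not necessarily the paper's own: the symbols \mu (stationary distribution over state-action pairs), \ell (per-step cost), and P (transition kernel) are assumptions introduced here for illustration.

```latex
% Standard dual LP for the average-cost MDP.
% Notation is assumed, not taken from the abstract:
%   \mu(x,a)       stationary state-action distribution (the LP variable)
%   \ell(x,a)      per-step cost
%   P(x' | x,a)    transition kernel
\begin{align*}
\min_{\mu}\quad & \sum_{x,a} \mu(x,a)\,\ell(x,a) \\
\text{s.t.}\quad & \sum_{a} \mu(x',a) = \sum_{x,a} \mu(x,a)\,P(x' \mid x,a) \quad \text{for all } x', \\
& \sum_{x,a} \mu(x,a) = 1, \qquad \mu(x,a) \ge 0 \quad \text{for all } (x,a).
\end{align*}
```

A feasible \mu induces a policy through \pi(a \mid x) = \mu(x,a) / \sum_{a'} \mu(x,a') wherever the denominator is positive. The program has one variable per state-action pair, which is why restricting \mu to a low-dimensional, feature-defined family matters when the state space is large.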

Similar resources

A New Compromise Decision-making Model based on TOPSIS and VIKOR for Solving Multi-objective Large-scale Programming Problems with a Block Angular Structure under Uncertainty

This paper proposes a compromise model, based on a new method, to solve the multi-objective large-scale linear programming (MOLSLP) problems with block angular structure involving fuzzy parameters. The problem involves fuzzy parameters in the objective functions and constraints. In this compromise programming method, two concepts are considered simultaneously. First of them is that the optimal ...

A Compromise Decision-Making Model Based on TOPSIS and VIKOR for Multi-Objective Large-Scale Nonlinear Programming Problems with A Block Angular Structure under Fuzzy Environment

This paper proposes a compromise model, based on a new method, to solve the multiobjective large scale linear programming (MOLSLP) problems with block angular structure involving fuzzy parameters. The problem involves fuzzy parameters in the objective functions and constraints. In this compromise programming method, two concepts are considered simultaneously. First of them is that the optimal alter...

A Compromise Decision-making Model for Multi-objective Large-scale Programming Problems with a Block Angular Structure under Uncertainty

This paper proposes a compromise model, based on the technique for order preference through similarity ideal solution (TOPSIS) methodology, to solve the multi-objective large-scale linear programming (MOLSLP) problems with block angular structure involving fuzzy parameters. The problem involves fuzzy parameters in the objective functions and constraints. This compromise programming method is ba...

A Non-linear Integer Bi-level Programming Model for Competitive Facility Location of Distribution Centers

The facility location problem is a strategic decision-making for a supply chain, which determines the profitability and sustainability of its components. This paper deals with a scenario where two supply chains, consisting of a producer, a number of distribution centers and several retailers provided with similar products, compete to maintain their market shares by opening new distribution cent...

Efficient Linear Approximations to Stochastic Vehicular Collision-Avoidance Problems

The key components of an intelligent vehicular collision-avoidance system are sensing, evaluation, and decision making. We focus on the latter task of finding (approximately) optimal collision-avoidance control policies, a problem naturally modeled as a Markov decision process. However, standard MDP models scale exponentially with the number of state features, rendering them inept for large-sca...

Publication date: 2014